bug(backend): Pipeline Version Link Returns "Cannot retrieve pipeline…#582
ederign wants to merge 3 commits into kubeflow:main
… version" Error

Fix race condition in upload_pipeline() that caused "Cannot retrieve pipeline version" errors when creating new pipelines.

The previous code created a pipeline (which auto-creates a default version), then created a second version with a random name, then tried to delete the default version by sorting versions by created_at timestamp. If both versions had the same timestamp (common with second-precision timestamps), the sort order was undefined and could delete the wrong version - the one we wanted to keep. The returned version ID then pointed to a deleted version.

Simplified the flow to just use the default version that upload_pipeline() creates, eliminating the race condition entirely:

- New pipelines: use the default version created by upload_pipeline()
- Existing pipelines: upload a new version with random name (unchanged)

Signed-off-by: Eder Ignatowicz <ignatowicz@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED |
|
@StefanoFioravanzo do you recall why upload_pipeline() has such complex logic? |
|
We saw this sometimes, but we were never able to reproduce it consistently! I believe this may be related: #522 |
jesuino left a comment
I did test starting multiple versions or multiple new pipelines and never had an error with this change, hence I do believe it is working as expected.
|
New changes are detected. LGTM label has been removed. |
|
@StefanoFioravanzo please review this when you have a chance! |
Summary
Fixes a race condition in upload_pipeline() that caused "Cannot retrieve pipeline version" errors when creating new pipelines. New pipelines now use the default version created by upload_pipeline() directly.

Kapture.2026-02-01.at.11.30.11.mp4
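The simplified flow this PR describes could look roughly like the sketch below. It is not the actual diff: `FakeClient`, its method signatures, its tuple return values, and the `upload` helper are all hypothetical stand-ins chosen for illustration, and the only behavior taken from this PR is that creating a pipeline also creates a default version.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeClient:
    """Hypothetical in-memory stand-in for a pipelines API client."""
    pipelines: dict = field(default_factory=dict)  # pipeline name -> version ids

    def get_pipeline_id(self, name):
        # In this sketch the pipeline name doubles as its ID.
        return name if name in self.pipelines else None

    def upload_pipeline(self, package, name):
        # Assumption from the PR text: creating a pipeline also
        # auto-creates a default version.
        version_id = f"{name}-default"
        self.pipelines[name] = [version_id]
        return name, version_id

    def upload_pipeline_version(self, package, version_name, pipeline_id):
        version_id = f"{pipeline_id}-{version_name}"
        self.pipelines[pipeline_id].append(version_id)
        return version_id


def upload(client, package, name):
    """Return (pipeline_id, version_id) without deleting any version."""
    pipeline_id = client.get_pipeline_id(name)
    if pipeline_id is None:
        # New pipeline: keep the default version that was auto-created.
        return client.upload_pipeline(package, name)
    # Existing pipeline: add a new version with a random name.
    version_name = uuid.uuid4().hex[:8]
    version_id = client.upload_pipeline_version(package, version_name, pipeline_id)
    return pipeline_id, version_id


client = FakeClient()
first = upload(client, "pipeline.yaml", "demo")   # new pipeline
second = upload(client, "pipeline.yaml", "demo")  # existing pipeline
```

Because nothing is deleted in either branch, the returned version ID always refers to a version that still exists.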
Fixes #581
Root Cause
The previous code created a pipeline (which auto-creates a default version), then created a second version with a random name, then tried to delete the default version by sorting versions by created_at timestamp.

If both versions had the same timestamp (common with second-precision timestamps), the sort order was undefined and could delete the wrong version - the one we wanted to keep. The returned version ID then pointed to a deleted version, causing the "Cannot retrieve pipeline version" error.
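The tie described above can be reproduced in isolation. The sketch below is a minimal model, not the original code: the `Version` record, the `pick_oldest` helper, and the IDs and names are hypothetical. It shows that when two versions share a `created_at` value, the "oldest" one is decided purely by the order in which the list happened to arrive.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for a pipeline version record."""
    id: str
    name: str
    created_at: datetime


def pick_oldest(versions):
    """Return the version a sort-by-created_at would treat as the default.

    sorted() is stable, so ties on created_at keep the input order,
    which here models whatever order the backend returned.
    """
    return sorted(versions, key=lambda v: v.created_at)[0]


# Both versions are created within the same second, so with
# second-precision timestamps their created_at values are equal.
same_second = datetime(2026, 2, 1, 11, 30, 11)
default = Version("v-default", "my-pipeline", same_second)
wanted = Version("v-wanted", "my-pipeline-x7k2", same_second)

# The "oldest" version depends only on the order of the input list.
deleted_a = pick_oldest([default, wanted])
deleted_b = pick_oldest([wanted, default])

print(deleted_a.id)  # v-default: the intended deletion
print(deleted_b.id)  # v-wanted: the version that should have been kept
```

In the second case the version selected for deletion is the one whose ID would be returned to the caller, which matches the failure mode described above.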
Test Plan